The effects of pruning methods on the predictive accuracy of induced decision trees

Decision trees are examples of easily interpretable models whose predictive accuracy is normally low. In comparison, decision tree ensembles (DTEs) such as random forest (RF) exhibit high predictive accuracy while being regarded as black-box models. We propose three new rule extraction algorithms from DTEs. The RF[Formula: see text]DHC method, a hill climbing method with downhill moves (DHC), is used to search for a rule set that decreases the number of rules dramatically. In the RF[Formula: see text]SGL and RF[Formula: see text]MSGL methods, the sparse group lasso (SGL) method, and the multiclass SGL (MSGL) method are employed respectively to find a sparse weight vector corresponding to the rules generated by RF. Experimental results with 24 data sets show that the proposed methods outperform similar state-of-the-art methods, in terms of human comprehensibility, by greatly reducing the number of rules and limiting the number of antecedents in the retained rules, while preserving the same level of accuracy.

Download Full-text

Combination classification method for customer relationship management

Asia Pacific Journal of Marketing and Logistics ◽

10.1108/apjml-03-2019-0125 ◽

2019 ◽

Vol 32 (5) ◽

pp. 1004-1022

Author(s):

Zhe Zhang ◽

Yue Dai

Keyword(s):

Genetic Algorithm ◽

Decision Trees ◽

Customer Relationship Management ◽

Predictive Accuracy ◽

Behavior Pattern ◽

Relationship Management ◽

Customer Relationship ◽

Combination Method ◽

Classification Problems ◽

Content Type

Purpose For classification problems of customer relationship management (CRM), the purpose of this paper is to propose a method with interpretability of the classification results that combines multiple decision trees based on a genetic algorithm. Design/methodology/approach In the proposed method, multiple decision trees are combined in parallel. Subsequently, a genetic algorithm is used to optimize the weight matrix in the combination algorithm. Findings The method is applied to customer credit rating assessment and customer response behavior pattern recognition. The results demonstrate that compared to a single decision tree, the proposed combination method improves the predictive accuracy and optimizes the classification rules, while maintaining interpretability of the classification results. Originality/value The findings of this study contribute to research methodologies in CRM. It specifically focuses on a new method with interpretability by combining multiple decision trees based on genetic algorithms for customer classification.

Download Full-text

Further Experimental Evidence against the Utility of Occam's Razor

Journal of Artificial Intelligence Research ◽

10.1613/jair.228 ◽

1996 ◽

Vol 4 ◽

pp. 397-417 ◽

Cited By ~ 45

Author(s):

G. I. Webb

Keyword(s):

Experimental Evidence ◽

Decision Trees ◽

Predictive Accuracy ◽

Training Data ◽

Learning Tasks ◽

Occam’S Razor ◽

Complex Decision ◽

Common Learning ◽

Considerable Doubt ◽

Occam's Razor

This paper presents new experimental evidence against the utility of Occam's razor. A~systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting more complex decision trees are demonstrated to have, on average, for a variety of common learning tasks, higher predictive accuracy than the less complex original decision trees. This result raises considerable doubt about the utility of Occam's razor as it is commonly applied in modern machine learning.

Download Full-text

Applying Informatics in Tissue Engineering

Methods of Information in Medicine ◽

10.1055/s-0038-1633921 ◽

2005 ◽

Vol 44 (01) ◽

pp. 38-43 ◽

Cited By ~ 1

Author(s):

Xiaolin Zhou ◽

Daping Yang ◽

Haiyan Ge ◽

Qi Wang ◽

Kang Tu ◽

...

Keyword(s):

Neural Networks ◽

Tissue Engineering ◽

Artificial Neural Networks ◽

Decision Trees ◽

Predictive Accuracy ◽

Animal Experiments ◽

Design Algorithm ◽

Artificial Neural ◽

Strategy Design ◽

Informatics Tools

Summary Objective: To facilitate tissue engineering strategies determination with informatics tools. Methods: Firstly, tissue engineering experimental data were standardized and integrated into a centralized database; secondly, we used data mining tools (e.g. artificial neural networks and decision trees) to predict the outcomes of tissue engineering strategies; thirdly, a strategy design algorithm was developed, and its efficacy was validated with animal experiments; lastly, we constructed an online database and a decision support system for tissue engineering. Results: The artificial neural networks and the decision trees respectively predicted the outcomes of tissue engineering strategies with the predictive accuracy of 95.14% and 85.26%. Following the strategies generated by computer, we cured 18 of the 20 experimental animals with a significantly lower cost than usual. Conclusion: Informatics is beneficial for realizing safe, effective and economical tissue engineering.

Download Full-text

Pruning trees in C-fuzzy random forest

Soft Computing ◽

10.1007/s00500-020-05270-3 ◽

2020 ◽

Author(s):

Łukasz Gadomer ◽

Zenon A. Sosnowski

Keyword(s):

Random Forest ◽

Decision Trees ◽

Classification Accuracy ◽

Computation Time ◽

Fuzzy Decision ◽

Pruning Method ◽

Fuzzy Decision Trees ◽

The Given ◽

Pruning Methods ◽

Fuzzy Random

Abstract Pruning decision trees is the way to decrease their size in order to reduce classification time and improve (or at least maintain) classification accuracy. In this paper, the idea of applying different pruning methods to C-fuzzy decision trees and Cluster–context fuzzy decision trees in C-fuzzy random forest is presented. C-fuzzy random forest is a classifier which we created and we are improving. This solution is based on fuzzy random forest and uses C-fuzzy decision trees or Cluster–context fuzzy decision trees—depending on the variant. Five pruning methods were adjusted to mentioned kind of trees and examined: Reduced Error Pruning (REP), Pessimistic Error Pruning (PEP), Minimum Error Pruning (MEP), Critical Value Pruning (CVP) and Cost-Complexity Pruning. C-fuzzy random forests with unpruned trees and trees constructed using each of these pruning methods were created. The evaluation of created forests was performed on eleven discrete decision class datasets (forest with C-fuzzy decision trees) and two continuous decision class datasets (forest with Cluster–context fuzzy decision trees). The experiments on eleven different discrete decision class datasets and two continuous decision class datasets were performed to evaluate five implemented pruning methods. Our experiments show that pruning trees in C-fuzzy random forest in general reduce computation time and improve classification accuracy. Generalizing, the best classification accuracy improvement was achieved using CVP for discrete decision class problems and REP for continuous decision class datasets, but for each dataset different pruning methods work well. The method which pruned trees the most was PEP and the fastest one was MEP. However, there is no pruning method which fits the best for all datasets—the pruning method should be chosen individually according to the given problem. There are also situations where it is better to remain trees unpruned.

Download Full-text

Constraint Enforcement on Decision Trees: a Survey

ACM Computing Surveys ◽

10.1145/3506734 ◽

2022 ◽

Author(s):

Géraldin Nanfack ◽

Paul Temple ◽

Benoît Frénay

Keyword(s):

Decision Trees ◽

Medical Diagnosis ◽

Predictive Accuracy ◽

State Of The Art ◽

Computational Time ◽

Future Research ◽

Research Directions ◽

Future Research Directions ◽

Constraint Enforcement ◽

Machine Learning Models

Decision trees have the particularity of being machine learning models that are visually easy to interpret and understand. Therefore, they are primarily suited for sensitive domains like medical diagnosis, where decisions need to be explainable. However, if used on complex problems, decision trees can become large, making them hard to grasp. In addition to this aspect, when learning decision trees, it may be necessary to consider a broader class of constraints, such as the fact that two variables should not be used in a single branch of the tree. This motivates the need to enforce constraints in learning algorithms of decision trees. We propose a survey of works that attempted to solve the problem of learning decision trees under constraints. Our contributions are fourfold. First, to the best of our knowledge, this is the first survey that deals with constraints on decision trees. Second, we define a flexible taxonomy of constraints applied to decision trees and methods for their treatment in the literature. Third, we benchmark state-of-the art depth-constrained decision tree learners with respect to predictive accuracy and computational time. Fourth, we discuss potential future research directions that would be of interest for researchers who wish to conduct research in this field.

Download Full-text

Practical Federated Gradient Boosting Decision Trees

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5895 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4642-4649 ◽

Cited By ~ 1

Author(s):

Qinbin Li ◽

Zeyi Wen ◽

Bingsheng He

Keyword(s):

Decision Trees ◽

Differential Privacy ◽

Predictive Accuracy ◽

Homomorphic Encryption ◽

Experimental Studies ◽

Locality Sensitive Hashing ◽

Gradient Boosting ◽

Local Data ◽

Original Record ◽

Privacy Constraints

Gradient Boosting Decision Trees (GBDTs) have become very successful in recent years, with many awards in machine learning and data mining competitions. There have been several recent studies on how to train GBDTs in the federated learning setting. In this paper, we focus on horizontal federated learning, where data samples with the same features are distributed among multiple parties. However, existing studies are not efficient or effective enough for practical use. They suffer either from the inefficiency due to the usage of costly data transformations such as secure sharing and homomorphic encryption, or from the low model accuracy due to differential privacy designs. In this paper, we study a practical federated environment with relaxed privacy constraints. In this environment, a dishonest party might obtain some information about the other parties' data, but it is still impossible for the dishonest party to derive the actual raw data of other parties. Specifically, each party boosts a number of trees by exploiting similarity information based on locality-sensitive hashing. We prove that our framework is secure without exposing the original record to other parties, while the computation overhead in the training process is kept low. Our experimental studies show that, compared with normal training with the local data of each party, our approach can significantly improve the predictive accuracy, and achieve comparable accuracy to the original GBDT with the data from all parties.

Download Full-text

Performance of Various Machine Learning Classifiers on Small Datasets with Varying Dimensionalities: A Study

Circulation in Computer Science ◽

10.22632/ccs-2016-251-23 ◽

2016 ◽

Vol 1 (1) ◽

pp. 30-35 ◽

Cited By ~ 3

Author(s):

Sahil Sharma ◽

Vinod Sharma

Keyword(s):

Machine Learning ◽

Decision Trees ◽

Supervised Learning ◽

Predictive Accuracy ◽

Ensemble Method ◽

Reduced Dimensionality ◽

Linear Discriminant ◽

Machine Learning Classifiers ◽

Learning Technique ◽

Better Than

Classification is an important supervised learning technique that is used by many applications. An important factor on which the performance of a classifier depends is the size of the dataset using which the classifier is going to be trained. In this manuscript the authors analyzed five different classification techniques (namely decision trees, KNN, SVM, linear discriminant and Ensemble method) in terms of AUC and predictive accuracy when trained using small datasets with different dimensionalities. The study was done using a dataset with 24 features and 400 instances (samples). The results showed that in general ensemble method (using boosted trees) performed better than others but its performance degraded a bit with reduced dimensionality.

Download Full-text